Abstract

Searching is a fundamental operation of computer science. Yet a
number of key mathematical questions about searching in Euclidean spaces
remain open. A number of such questions are formulated and answered here
for searching lines in the plane. Relationships between the results here
and higher dimensional analogs for other problems of interest are given.
Among the new results is a mathematical framework in which questions about
searching can be stated in a more uniform manner than was possible before.
Specific results are also given on the searching complexity of various sets
of lines in the plane. In particular, we show that there are easy and hard
sets of lines to search and establish methods of generating upper and lower
bounds on the search complexities of such sets.
I. Introduction

A fundamental operation of computer science is searching. Certainly
the majority of actual computation involves the processing and organization
of data into sets which are to be sorted in a manner that makes repeated
searches as simple as possible. Furthermore, Knuth [5] has devoted an
entire chapter of his encyclopedic work on computer programming to the
study of methods of computer searching. Despite this enormous focus on
searching, a number of key mathematical issues regarding searching remain
either unexplored or unanswered. Among these is the
searching of a set of geometric objects in Euclidean space. In addition
to the existence of such problems as extensions and embellishments to
previously studied problems of geometric complexity (see e.g. [2], [9]),
this framework appears to be a natural setting for the generation of lower
bounds on the knapsack, partition and travelling salesman problems as well
as variants of the sorting problem. Furthermore, this methodology has also
produced many good upper bounds which can be used to solve practical problems
in such diverse areas as information retrieval, numerical analysis, and
artificial intelligence. The main goal of this paper will be to lay the
beginnings of a unified framework through which all questions of geometric
searching can be resolved. To give an idea of the complexity of such a
theory we pause to give an example of an elementary result within this
theory which appears very anomalous. Consider the problem of determining
membership of a point on or among a set of n lines in the plane which are
in general position. That is, we are given a set of n lines in the plane
with the condition that no three have a point in common and each pair has
exactly one point in common (i.e. no two are parallel). We then wish to
ask questions about a new point, determining at each query whether it lies
to the left or right of one of the given lines. Our procedure halts after
enough queries have been made to know whether the given point lies on any
of the lines or, if not, which lines bound the region in which it lies. A
reasonable conjecture, given that any set of n lines of the plane in general
position forms exactly $\frac{1}{2}(n^2+n+2)$ regions, is that the searching
complexity of any set of n lines in general position is the same. Yet, as we
shall see in subsequent sections of this paper, there is a set of n lines which
can be searched in $O(\log^2 n)$ queries while another set is shown to require
n queries, an exponential gap. Such anomalies together with the guiding
principle that "intuition about geometric problems is seldom correct"
characterize this as a difficult problem. However, recent results [1,2,3,4,9]
concerning searching complexities and lower bounds tend to characterize
these as fruitful areas of research. Among the results reported in these
papers are upper bounds of practical importance on some searching problems
as well as lower bounds of n and n log n queries on linear search tree
programs (i.e. each query is f(x) R 0 where f is an affine function on the
input x and R is >, = or <) for the knapsack (i.e. given $x_1, \ldots, x_n$, b,
does there exist $I \subseteq \{1,\ldots,n\}$ such that $\sum_{i \in I} x_i = b$)
and Element Uniqueness (i.e. given $x_1, \ldots, x_n$, does there exist
$i \neq j$ such that $x_i = x_j$) problems.
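
To make the link to geometric searching concrete, the knapsack question can be read as asking whether the input point (x_1, ..., x_n, b) lies on one of the hyperplanes "sum of x_i over I, minus b, equals 0". The following Python sketch illustrates that reading by brute force. The function name `knapsack_hyperplane` and the restriction to nonempty index sets I are assumptions made for this example; the enumeration is not one of the linear search tree programs analyzed in the cited papers.

```python
from itertools import combinations

def knapsack_hyperplane(xs, b):
    """Return the first nonempty index set I with sum(xs[i] for i in I) == b, or None.

    Each candidate I defines an affine function f_I(x, b) = sum_{i in I} x_i - b.
    The answer is "yes" exactly when the input lies on some hyperplane f_I = 0.
    """
    n = len(xs)
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            # One query of the form "f_I(x, b) = 0".
            if sum(xs[i] for i in subset) - b == 0:
                return subset
    return None

hit = knapsack_hyperplane([3, 5, 8], 13)   # 5 + 8 = 13
miss = knapsack_hyperplane([3, 5, 8], 4)   # no subset sums to 4
```

The brute-force loop makes one equality query per subset; the lower bounds mentioned above concern how few affine queries any tree program can get away with.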

In the current paper, we will focus our attention on problems
involving searching lines in the plane. Such problems are of interest in
themselves as well as a gateway to problems involving hyperplane searches
in higher dimensional Euclidean spaces. Our goal will be one of classification
of the complexity of searching different sets of lines. Two distinct cases
exist: in the first, queries may be made only of the original lines, and in
the second, new lines may be added with queries made with respect to the
original or new lines. Thus, if we define $c(A)$ and $\bar{c}(A)$ as the complexity
under the first and second measures of searching the set of lines A, then

$$\bar{c}(A) = \min_{B} c(A \cup B)$$

where B is any new set of lines. Among the results presented here are

$$2 \log |A| \le \bar{c}(A) \le 3 \log |A|$$

for any set A, where |A| is the number of lines in A, and the existence for
each n of sets $A_n'$ and $A_n''$ of n lines such that

$$c(A_n') \le \tfrac{3}{4} \log^2 |A_n'|,$$
$$c(A_n'') = |A_n''| = n.$$

These results leave us unable to make general statements about the $c(\cdot)$
function as we could about the $\bar{c}(\cdot)$ function. Hence we concentrate our
efforts on methods for determining, for any set A, the value of $c(A)$. To
do so, it is necessary to introduce new ideas to the standard mathematical
notions of general position. It is at this point that our work diverges
from the standard mathematical literature on this subject. However, we
believe that some of the methods and new ideas introduced here will, in
addition to resolving questions regarding searching lines in planes,
yield insight into methods of extending the known lower bound of $\frac{1}{2}n^2$ on
the complexity of the knapsack problem in n dimensions, as the issues there
are merely higher-dimensional analogs of those introduced here.

The organization of the paper is as follows. In the next section,
the exact problem which we are considering is presented in detail. The
concepts briefly spelled out above are concretely defined. Following
that, some definitions and results concerning the geometry of intersecting
lines in the plane are given. Some of these results belong to the
classical mathematical literature on the problem while others were derived
within the context of this problem. Results found by applying these
results to the problems at hand are also surveyed.


II.  Problem  Statement 

Searching problems in the plane will be our focus. Such a problem
consists of a set of lines dividing the plane into regions. Our lines will
be in general position, hence no two are parallel and no three have a point
in common. Thus the number of regions formed by a set of n such lines will
be $\frac{1}{2}(n^2+n+2)$. The searching problem for lines in the plane then consists
of determining for a new point in which of these regions it lies. Our
goal is to determine the complexity of searching any given set of lines in
the plane. The algorithms we allow are linear tree programs, which have been
widely used before [3,7,10,11]. Such programs consist of three types of
statements, branches of the form

$S_k$: if f(x) R 0 then go to $S_m$ else go to $S_n$,

and decision statements of the form

$S_i$: point x belongs to one of the lines

$S_j$: point x belongs to region $\mathcal{R}$ and none of the lines

where f is a linear function on the input point x, R is one of the relations
{<, =, >}, and $\mathcal{R}$ is a specification of one of the regions formed by the
intersecting lines. The complexity of such an algorithm is defined as the
length of the longest path from its root to any decision statement.
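
The model just described can be sketched in Python as a small data structure. This is a minimal illustration, not the authors' formalism: the class names `Branch` and `Decision`, the helpers `run`, `complexity` and `line`, and the single example line x = 0 are all assumptions introduced here.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Tuple, Union

Point = Tuple[float, float]
RELATIONS = {"<": operator.lt, "=": operator.eq, ">": operator.gt}

@dataclass
class Decision:
    label: str                      # "on a line" or the name of a region

@dataclass
class Branch:
    f: Callable[[Point], float]     # affine function of the input point
    rel: str                        # one of "<", "=", ">"
    then: "Node"                    # taken when f(x) rel 0 holds
    otherwise: "Node"               # taken when it does not

Node = Union[Decision, Branch]

def run(node: Node, p: Point):
    """Follow branches for point p; return (decision label, number of queries)."""
    queries = 0
    while isinstance(node, Branch):
        queries += 1
        node = node.then if RELATIONS[node.rel](node.f(p), 0) else node.otherwise
    return node.label, queries

def complexity(node: Node) -> int:
    """Length of the longest path from this node to a decision statement."""
    if isinstance(node, Decision):
        return 0
    return 1 + max(complexity(node.then), complexity(node.otherwise))

def line(a: float, b: float, c: float) -> Callable[[Point], float]:
    """Affine function f(x, y) = a*x + b*y + c; the line is the set f = 0."""
    return lambda p: a * p[0] + b * p[1] + c

# Hypothetical one-line instance: the line x = 0 splits the plane in two.
f = line(1.0, 0.0, 0.0)
tree = Branch(f, "=", Decision("on the line"),
              Branch(f, "<", Decision("left region"), Decision("right region")))
```

With two-way branches, even a single line needs two queries in the worst case (one to rule out equality, one to pick a side), which is what `complexity(tree)` reports for this instance.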

Within this model, we consider two complexity measures on the searching
of lines. In the first, the function f is restricted to represent one of
the original lines. Thus, the problem here is to determine to which region
a point belongs using only comparisons to the original lines. We define the
complexity of searching a set of lines, A, under this measure as c(A). One
is tempted to believe that c(A) = |A|, the cardinality of A, but the
following example shows otherwise.




Figure  1:  A set  of  7 lines  to  be  searched. 
We observe that to the left of one of the lines, call it L, three of the
remaining lines do not intersect, and to the right of L the other three do
not intersect. Hence if x lies to the left of L, we can search the first
three lines by a binary search algorithm, and similarly for the other three
if x lies to the right of L. Therefore, in at most 6 comparisons we can
search these lines. Since all sets of lines are taken to be in general
position, it would be reasonable to assume that c(A) is fixed for fixed |A|.
This is untrue, since a set of 7 lines forming a septagon has a searching
complexity of 7. We shall see in later sections that c(A) varies greatly
with A for fixed |A|.
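
The binary search step in this example can be sketched in Python. The sketch assumes lines given as (slope, intercept) pairs that do not cross inside the region being searched and are listed from bottom to top there. The three sample lines, the helper names `side` and `locate`, and the convention of counting one three-way test per line examined are illustrative assumptions, not taken from Figure 1.

```python
def side(line, p):
    """Return +1 if p is above the line y = m*x + b, -1 if below, 0 if on it."""
    m, b = line
    v = p[1] - (m * p[0] + b)
    return (v > 0) - (v < 0)

def locate(sorted_lines, p):
    """Binary search among lines that are totally ordered in the region holding p.

    Returns ("on", index, tests) if p lies on a line, otherwise
    ("gap", k, tests), where k is the number of lines below p.
    """
    lo, hi = 0, len(sorted_lines)
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        s = side(sorted_lines[mid], p)
        if s == 0:
            return ("on", mid, tests)
        if s > 0:
            lo = mid + 1      # p is above line mid: discard it and all lines below
        else:
            hi = mid          # p is below line mid: discard it and all lines above
    return ("gap", lo, tests)

# Hypothetical lines y = 3x - 3, y = 2x - 1, y = x: pairwise non-parallel, with
# all crossings at x > 0, so in the half-plane x < 0 they are ordered bottom to top.
lines_bottom_to_top = [(3, -3), (2, -1), (1, 0)]
result = locate(lines_bottom_to_top, (-1, -2))
```

For three ordered lines the loop examines at most two of them, which is where the saving over testing every line comes from.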

The second complexity measure we use allows for the introduction
of new searching objects. The function f can now be any line in the plane.
For this case, we represent the complexity of searching a set A of lines as
$\bar{c}(A)$. It is easy to see that $\bar{c}(A) = \min_B c(A \cup B)$, taken over all sets of
lines B. In a previous paper [2], we showed that $\bar{c}(A) \le 3 \log |A|$, and a simple
region counting argument yields $\bar{c}(A) \ge 2 \log |A|$. However, an exact bound
on $\bar{c}(A)$ would be of value as this would yield insight into methods of
generating better than information theoretic lower bounds on searching. In
a related paper, applications of such results to tight bounds on the knapsack
problem are given.

Throughout, we shall use r(A) to denote the largest number of sides
of any polygon formed by intersections of the lines in A. Clearly r(A) is
a lower bound on c(A).

III.  Results 

In this section the basic structure of $c(A)$, $\bar{c}(A)$, and $r(A)$ is
investigated. In addition to proving a number of simple but basic facts, we
also demonstrate that understanding these functions is going to be a
non-trivial task. This follows for two different but related reasons. First,
the classical literature on arrangements of lines in the plane is filled with
simple sounding assertions that are open. Indeed much of this literature is
still trying to answer questions of the form "how many ... are there?". In
contrast our research requires answers to questions of the form "how many
... are there and where are they with respect to ...?". Second, we are able
to prove at least two results that are unexpected. Moreover, these results
show that simple and intuitive arguments about even the function c(A) are
possibly going to be incorrect. In particular we show that complexity
behaves poorly with respect to disjoint union, i.e. there are disjoint
sets A and B such that

$$c(A \cup B) \ll c(A) + c(B)$$

($\ll$ means much smaller. See theorem 5 for details.) This result has a
similar flavor to the result of Schnorr [8] on the corresponding question
for Boolean circuits.

We first observe the following two easy lower bounds on c(A).

Theorem 1: Let A be a set of lines in the plane. Then $c(A) \ge r(A)$ and
$c(A) \ge \log_2 |A|$.

Proof: Recall that r(A) is the number of sides of the largest region formed
by A. Thus, a simple adversary argument demonstrates the lower bound of r(A).
The lower bound of $\log_2 |A|$ is the usual information theory argument. □
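
The information theory argument can be written out as a short calculation. This is a sketch under the assumption, taken from Section II, that the n = |A| lines are in general position and therefore form $\frac{1}{2}(n^2+n+2)$ regions. A tree with two-way branches whose longest path has length d has at most $2^d$ decision statements, and each region needs its own decision statement, so:

```latex
2^{d} \;\ge\; \tfrac{1}{2}(n^2+n+2) \;\ge\; \tfrac{1}{2}n^2
\quad\Longrightarrow\quad
d \;\ge\; 2\log_2 n - 1 \;\ge\; \log_2 n \qquad (n \ge 2).
```

The last inequality holds because $2\log_2 n - 1 \ge \log_2 n$ is equivalent to $\log_2 n \ge 1$.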

We now study a simple general method of obtaining upper bounds on
$c(A)$, $\bar{c}(A)$, and $r(A)$.

Theorem 2: Let A and B be sets of lines in the plane. Then

(1) $c(A \cup B) \le c(A) + c(B)$

(2) $r(A \cup B) \le r(A) + r(B)$.

Proof:

(1) Any search trees for A and B respectively can be combined to
form one for $A \cup B$ of depth at most c(A) + c(B). (This uses the
convexity of the regions that A and B form.)

(2) We sketch a proof that $r(A \cup B) \le r(A) + r(B)$. Let R be the largest
region of $A \cup B$; let $r_1, \ldots, r_k$ be the sides of R. Partition
$r_1, \ldots, r_k$ into $s_1, \ldots, s_m$ and $t_1, \ldots, t_n$ such that each $s_i$ is
part of a line from A and each $t_j$ is part of a line from B. By
a convexity argument we can show that there is a region with at
least m sides in A (alone) and one with at least n sides in B
(alone). Thus, $r(A \cup B) = m+n \le r(A) + r(B)$. The convexity
argument is as follows: Consider the sides $s_1, \ldots, s_m$. Now

extend them; they form a region with m sides. The other lines
from A cannot cut any of $s_1, \ldots, s_m$; hence in A
we must have a region of at least m sides. □


If the last theorem were tight one might hope that c(A) would be
about |A|. This is of course trivially false if A contains parallel
lines. Thus a more interesting question is: Does c(A) equal about |A|
for A in general position? We know from [2] that for $\bar{c}(A)$ this is false, i.e.

$$\bar{c}(A) \le 3 \log_2 |A|.$$

We now show that c(A) can also be very small compared to |A|. Note
before we continue that c(A) = |A| is possible for A in general position
since there are such A with r(A) = |A|.

Theorem 3: For any n there is a set A of |A| = n lines in general position
such that $c(A) = O(\log^2 |A|)$.


Proof: We proceed by induction. Let k be a constant such that for each
i < n, there is a set, $x_i$, of i lines with $c(x_i) \le k \log_2^2 |x_i|$. Construct
a set $x_n$ as follows:

I. Choose two lines $L_1$ and $L_2$ which divide the plane into four quadrants.
II. Choose four sets A, B, C and D such that each is a copy of $x_{(n-2)/4}$ and
all intersections between lines in A occur within the first quadrant
formed by $L_1$ and $L_2$, all B intersections in the second, ..., all D
intersections in the fourth.

This yields the structure

[Figure: $L_1$ and $L_2$ divide the plane into four quadrants, with all A
intersections in the first, all B intersections in the second, all C
intersections in the third, and all D intersections in the fourth.]


Note  that  we  put  no  restrictions  on  the  locations  of  intersections 
of  lines  from  different  sets. 

Now, we may search this set by first determining in which quadrant
the point to be searched for lies. This requires 2 comparisons. Assume
without loss of generality that the point lies in quadrant 1. We then
consider the complexity of searching $A \cup B \cup C \cup D$ in the first quadrant.
However,

$$c_1(A \cup B \cup C \cup D) \le c_1(A) + c_1(B) + c_1(C) + c_1(D)$$

where $c_1$ represents the complexity of searching in the first quadrant.
We observe that $c_1(B)$, $c_1(C)$ and $c_1(D)$ are at most $\log_2\left(\frac{n-2}{4}\right)$, as the lines
in each of these sets have no intersections in the first quadrant and
hence are totally ordered here. By induction, $c_1(A) \le k \log_2^2\left(\frac{n-2}{4}\right)$. Hence

$$c(x_n) \le 2 + 3\log_2\left(\frac{n}{4}\right) + k\log_2^2\left(\frac{n}{4}\right)
= k\log_2^2 n + (3-4k)\log_2 n + (4k-4) \le k\log_2^2 n \quad \text{for } k \ge 3/4.$$

Q.E.D.
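
The algebra in the last step can be checked mechanically. The following Python sketch writes L for $\log_2 n$ and uses exact rational arithmetic; the function names `step_bound` and `expanded` are introduced here for illustration only, and the check covers a finite range of L rather than proving the inequality.

```python
from fractions import Fraction

def step_bound(L, k):
    """2 quadrant queries + 3 totally ordered sets + inductive bound on the copy.

    With L = log2(n), log2(n/4) = L - 2, so this is
    2 + 3*log2(n/4) + k*log2(n/4)**2.
    """
    return 2 + 3 * (L - 2) + k * (L - 2) ** 2

def expanded(L, k):
    """The same quantity after expanding: k*L^2 + (3 - 4k)*L + (4k - 4)."""
    return k * L ** 2 + (3 - 4 * k) * L + (4 * k - 4)

k = Fraction(3, 4)
# Compare the recurrence step with the target bound k*L^2 for L = 2, ..., 11.
checks = [(L, step_bound(L, k), k * L ** 2) for L in range(2, 12)]
identity_holds = all(step_bound(L, k) == expanded(L, k) for L in range(2, 12))
bound_holds = all(step <= target for _, step, target in checks)
```

For k = 3/4 the linear term vanishes and the constant term is -1, so the step bound sits exactly one below $k \log_2^2 n$ at every L in the range.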


By methods of Lipton-Dobkin [6] we can use theorem 3 to demonstrate
that there is a hierarchy in the following sense:

Corollary 4: For any monotone f(n) such that $f(n) \le n$ and ...
there is a family $\{A_n\}$ such that $|A_n| = n$ and

$$f(n) \le c(A_n) \le O(f(n)).$$


As  stated  earlier  we  will  now  show  that  c(A)  behaves  poorly  with 
respect  to  disjoint  union. 


Theorem 5: For any $n \ge 1$ there are sets of lines A and B in general
position, with |A| = |B| = n, such that $A \cup B$ is also in general position and

(1) $c(A \cup B) = O(\log^2 n)$

(2) $c(A) + c(B) \ge cn$ for some constant c > 0.

Sketch of Proof: Let A be a set of n lines in general position such that
r(A) = n; let R be this region with n sides. Let B be the set of m lines
constructed in theorem 3, positioned so that all the intersection points
formed by the lines of B lie within R. Now to search $A \cup B$ we proceed as
follows: First, determine where with respect to B we are. This can be
done in $O(\log^2 m)$ steps. Second, if we are in a bounded region of B,
then we must lie inside R and we are done. On the other hand, if we are
in an unbounded region we argue as follows. The m+1 unbounded regions of
B can be arranged with respect to A so that we can determine where we
are in at most $O(\log^2 n)$ additional steps.